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A method for the reconstruction of the primordial density fluctuation field is presented. Various 
previous approaches to this problem rendered non-unique solutions. Here, it is demonstrated that 
the initial positions of dark matter fluid elements, under the hypothesis that their displacement is 
the gradient of a convex potential, can be reconstructed uniquely. In our approach, the cosmological 
reconstruction problem is reformulated as an assignment problem in optimisation theory. When 
tested against numerical simulations, our scheme yields excellent reconstruction on scales larger 
than a few megaparsecs. 

I. INTRODUCTION 

The present distribution of galaxies brought to us by redshift surveys indicates that the Universe on large scales 
exhibits a high degree of dumpiness with coherent structures such as voids, great walls, filaments and clusters. The 
cosmic microwave background (CMB) explorers, however, indicate that the Universe was highly homogeneous billions 
of years ago. When studying these data, among the questions that are of concern in cosmology are the initial conditions 
of the Universe and the dynamics under which it grew into its present form. CMB explorers provide us with valuable 
knowledge into the initial conditions of the Universe, but the present distribution of the galaxies opens a second, 
complementary window into its early state. 

Unraveling the properties of the early Universe from the present data is an instance of the general class of inverse 
problems in physics. In cosmology this problem is frequently tackled in an empirical way by taking a forward approach. 
In this approach, a cosmological model is proposed for the initial power spectrum of dark matter. Next, a particle 
presentation of the initial density field is made which provides the initial data for an N-body simulation which is 
run using Newtonian dynamics and is stopped at the present time. Subsequently, a statistical comparison between 
the outcome of the simulation and the observational data can be made, assuming that a suitable bias relation exists 
between the distribution of galaxies and that of dark matter. If the statistical test is satisfactory then the implication 
is that the initial condition was viable; otherwise one changes the cosmological parameters and goes through the whole 
process again. This is repeated until one obtains a satisfactory statistical test, affirming a good choice for the initial 
condition. 

However, this inverse problem does not have to be necessarily dealt with in a forward manner. Can one fit the 
present distribution of the galaxies exactly rather than statistically and run it back in time to make the reconstruction 
of the primordial density fluctuation field? Since Newtonian gravity is time-reversible, one would have been able to 
integrate the equations of motions back in time and solve the reconstruction problem trivially, if in addition to their 
positions, the present velocities of the galaxies were also known. As a matter of fact, however, the peculiar velocities 
of only a few thousands of galaxies are known out of the hundreds of thousands whose redshifts have been measured. 
Indeed, one goal of reconstruction is to evaluate the peculiar velocities of the galaxies and in this manner put direct 
constraints on cosmological parameters. 

Without a second boundary condition, reconstruction would thus be an infeasible task. Newton's equation of motion 
requires two boundary conditions, whereas, for reconstruction so far we only have mentioned the present positions of 
the galaxies. The second condition is the homogeneity of the initial density field: as we go back in time the peculiar 
velocities of the galaxies vanish. Thus, contrary to the forward approach where one solves an initial-value problem, in 
the reconstruction approach one is dealing with a two-point boundary value problem. In the former, one starts with the 
initial positions and velocities of the particles and solves Newton's equations arriving at a unique final position and 
velocity for a given particle. In the latter one does not always have uniqueness. This has been one of the shortcomings 
of reconstruction, which was consequently taken to be an ill-posed problem. 

In this work, we report on a new method of reconstruction which guarantees uniqueness (Frisch et al. 2002). 
In Section II, we review the previous works on reconstruction. We describe our mathematical formulation of the 
reconstruction problem in Section III and show that the cosmological reconstruction problem, under our hypotheses, 
is an example of assignment problems in optimisation theory. In Section IV, we describe the numerical algorithms to 
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solve the assignment problem. In Section V, we test our reconstruction method with numerical N-body simulation 
both in real and redshift spaces. Section VI contains our conclusions. 



II. A BRIEF REVIEW OF VARIATIONAL AND EULERIAN AND LAGRANGIAN APPROACHES TO 

RECONSTRUCTION 



The history of reconstruction goes back to the work of Peebles (Peebles 1989) on tracing the orbits of the members 
of the local group. In his approach, reconstruction was solved as a variational problem. Instead of solving the 
equations of motion, one searches for the stationary points of the corresponding Euler-Lagrange action. The action 
in the comoving coordinates is (Peebles 1980) 
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where summation over repeated indices and j ^ i is implied, to denotes the present time, the path of the ith 
particle with mass m, is Xj(i), pb is the mean mass density, and the present value of the expansion parameter a(t) is 
a = a(to) = 1. The equation of motion is obtained by requiring 
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The boundary conditions 
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would then eliminate the boundary terms in (3). 

The components a = 1,2,3 of the orbit of the ith particle are modeled as 
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The functions /„ are normally convenient functions of the scale factor a and should satisfy the boundary conditions 
(4). Initially, trial functions such as /„ = a n (l — a) or /„ = sin(n7ra/2) with /o = cos(7ra/2) were used. The 
coefficients Ci tTl are then found by substituting expression (5) in the action (1) and finding the stationary points. 
That is for physical trajectories 
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In his first work, Peebles (1989) considered only the minimum of the action, while reconstructing the trajectories 
of the galaxies back in time. In a low-density Universe, assuming a linear bias, the predicted velocities agreed with 
the observed ones for most galaxies of the local group but failed with a large discrepancy for the remaining members. 
Later on, it was found that when the trajectories corresponding to the saddle-point of the action were taken instead 
of the minimum, much better an agreement between predicted and observed velocities was obtained, for almost all 
the galaxies in the local group (see Fig. 1). Thus, by adjusting the orbits until the predicted and observed velocities 
agreed, reasonable bounds on cosmological parameters were found (Peebles 1989, 1990). 
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FIG. 1. A schematic demonstration of Peebles' reconstruction of the trajectories of the members of the local neighbourhood 
using a variational approach based on the minimisation of Euler-Lagrange action. The arrows go back in time, starting from 
the present and pointing towards the initial positions of the sources. In most cases there is more than one allowed trajectory 
due to orbit crossing (closely related to the multistreaming of the underlying dark matter fluid). The pink (darker) orbits 
correspond to taking the minimum of the action whereas the yellow (brighter) orbits were obtained by taking the saddle-point 
solution. Of particular interest is the orbit of N6822 which in the former solution is on its first approach towards us and in 
the second solution is in its passing orbit. A better agreement between the evaluated and observed velocities was shown to 
correspond to the saddle point solution. 

Although rather successful when applied to catalogues such as NGB (Tully 1988), reconstruction with such an aim, 
namely establishing bounds on cosmological parameters using measured peculiar velocities, could not be applied to 
larger galaxy redshift surveys, containing hundreds of thousands of galaxies for the majority of which the peculiar 
velocities are unknown. For large datasets, it is not possible to use the velocities, to choose the right orbit from the 
many which are all physically possible. In order to resolve the problem of multiple solutions (the existence of many 
mathematically possible orbits) one normally had to do significant smoothing and then try the computationally costly 
reconstruction using Peebles variational approach (Shaya et al. 1995, Branchini et al. 2001). However, one was still 
not guaranteed to have chosen the right orbit (see Hjorteland 1999 for a review of the action variational method). 

The multiple solution can be caused by various factors. For example, the discretisation in the numerical integrations 
(7) can produce spurious valleys in the landscape of the Euler-Lagrange action. However, even overcoming all these 
numerical artifacts one is still not guaranteed uniqueness. There is a genuine physical reason for the lack of uniqueness 
which is often referred to in cosmology as multistreaming. Cold dark matter (CDM) is a collisionless fluid with no 
velocity dispersion. An important feature that arises in the course of the evolution of a self-gravitating CDM Universe 
is the formation of multistream regions on which the velocity field is non-unique. These regions are bounded by 
caustics at which the density is divergent. At a multistream point where velocity is multiple- valued, a particle can 
have many different mathematically viable trajectories each of which would correspond to a different stationary point 
of the Euler-Lagrange action which is no-longer convex. 

In addition to the variational approach to reconstruction, there is the well-known POTENT reconstruction of the 
three-dimensional velocity field from its radial component (Dckcl et al. 1990). The POTENT method is the non- 
iterative Eulerian version of the original iterative Lagrangian method (Bertschinger and Dekel 1989) and assumes 
potential flow in Eulerian space. POTENT finds the velocity potential field by integrating the radial components of 
the velocities along the line of sight. Since this is an Eulerian method, its validity does not extend much beyond the 
linear Eulerian regime. 

In following works, with the use of POTENT-reconstructed velocity field or the density field obtained from the 
analysis of the redshift surveys, the gravitational potential field was also reconstructed (Nusser and Dekel 1992). 
The gravitational potential was then traced back in time using the so-called Zel'dovich-Bernoulli equation which 
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combines the Zel'dovich approximation with the Eulcr momentum conservation equation. Later on, it was further 
shown (Gramann 1993) that the initial gravitational potential is more accurately recovered using the Zel'dovich- 
continuity equation which combines the Zel'dovich approximation with the mass continuity equation (Nusser et al. 
1991). The Eulcrian formulations of Zel'dovich approximation was also extended to second-order and it was found 
that this extension does not give significant improvement in the reconstruction (Monaco and Efstathiou 1999, see 
Sahni and Coles 1995 for various approximation schemes). 

However, these procedures are valid only as long as the density fluctuations are in the Eulerian linear or quasi-linear 
regimes (defined by \(p — pb)/pb\ < 1)- They do not robustly recover the initial density in regions of high density when 
the present-day structures are highly nonlinear (|(p — pb)//>b| ^ !)■ Therefore, they require that the final density 
field be smoothed quite heavily to remove any severe nonlincarities, before the dynamical evolution equations are 
integrated backward in time. 

Here, we describe a new method of reconstruction (Frisch et al. 2002) which not only overcomes the problem of 
nonuniquencss encountered in the least-action-based methods but also remains valid well beyond the linear Eule- 
rian approximations employed in many other popular reconstruction methods, a few of which have been mentioned 
previously. 



III. MONGE— AMPERE— KANTOROVICH RECONSTRUCTION 



Thus, reconstruction is a well-posed problem for as long as we avoid multistream regions. The mathematical 
formulation of this problem is as follows (Frisch et al. 2002). Unlike most of the previous works on reconstruction 
where one studies the Euler-Lagrange action, we start from a constraint equation, namely the mass conservation, 



p(x)dx = p (q)dq 



(8) 



where po(<l) is the density at the initial (or Lagrangian) position, q, and p(x) is the density at the present (or Eulerian) 
position, x, of the fluid element. The above mass conservation equation can be rearranged in the following form 
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where det stands for determinant and po(q) is constant. The right-hand-side of the above expression is basically given 
by our boundary conditions: the final positions of the particles are known and the initial distribution is homogeneous, 
Po(q) = const. To solve the equation, we make the following two hypotheses: first that the Lagrangian map (q — > x) 
is the gradient of a potential <& and second that the potential $ is convex . That is 



x(q,t) = V,$(q,t) . 



(10) 



The convexity guarantees that a single Lagrangian position corresponds to a single Eulerian position, i.e., there has 
been no multistreaming 1 . These assumptions imply that the inverse map x — > q has also a potential representation 



q= V x 0(x,t) 
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where the potential O(x) is also a convex function and is related to <&(x) by the Legendre-Fenchel transform (e.g. 
Arnold 1978) 
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The inverse map is now substituted in (9) yielding 
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1 The gradient condition has been used in previous works (Bertschinger and Dekel 1989) on the reconstruction of the peculiar 
velocities of the galaxies using linear Lagrangian theory. 
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which is the well-known Monge-Ampere equation. The solution to this 222 years old problem has recently been 
discovered (Brenier 1987, Benamou and Brenier 2000) when it was realised that the map generated by the solution 
to the Monge-Ampere equation is the unique solution to an optimisation problem. This is the Monge-Kantorovich 
mass transportation problem, in which one seeks the map x — > q which minimises the quadratic cost function 

1=1 p (q)|x-q| 2 d 3 g= f p(x)|x - q\ 2 d 3 x . (14) 

J q J x 

A sketch of the proof is as follows. A small variation in the cost function yields 

81 = { [2p(x)(x-q) -Sxl^x , (15) 

J X 

which must be supplemented by the condition 

V x • (p(x)Sx) = , (16) 

which expresses the constraint that the Eulerian density remains unchanged. The vanishing of SI should then hold 
for all x — q which are orthogonal (in L 2 ) to functions of zero divergence. These are clearly gradients. Hence x — q(x) 
and thus q(x) is a gradient of a function of x. 

Discretising the cost (14) into equal mass units yields 

1 = ffi (j2 - x *) 2 ^) • 

The formulation presented in (17) is known as the assignment problem: given N initial and N final entries one has to 
find the permutation which minimises the quadratic cost function (see Fig. 2). 
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FIG. 2. Solving the reconstruction problem as an assignment problem. An example of a system of N = 9 particles is sketched. 
The cost matrix is shown. For example the entry C32 is the cost of getting from the Lagrangian position q2 to the Eulerian 
position X3. The cost of this transport is C32 = (q2 — X3) 2 . The total cost, Ct of a given permutation is the sum of all such 
costs which are chosen for that permutation. That is Ct = Generally, there are TV! possible permutations. Only one 

of these TV! permutations is the optimal assignment: C op timai = mhij^) (Ct). For this system there are 362,880 different costs 
possible, each obtained by a different permutation. Algorithms with factorial complexity are clearly impractical even for small 
systems. However, assignment algorithms have complexities of polynomial degrees. 



5 



IV. SOLVING THE ASSIGNMENT PROBLEM 



If one were to solve the assignment problem for N particles directly, one would need to search among N\ possible 
permutations for the one which has the minimum cost. However, advanced assignment algorithms exist which reduce 
the complexity of the problem from factorial to polynomial (so far for our problem as verified by numerical tests 
for N < 20,000, e.g., Burkard and Derigs 1980 algorithm has a complexity of N 3 - e for a constant-volume sampling 
of particles and a complexity of TV 2 8 for a constant-density sampling. The Bertsekas 1998 auction algorithm has a 
complexity of TV 2 2 in both cases). Before discussing these methods, let us briefly comment on a class of stochastic 
algorithms, which do not give uniqueness and hence should be avoided. 

In the PIZA method of solving the assignment problem (Croft & Gaztahaga 1997), initially a random pairing 
between N Lagrangian and N Eulerian coordinates is made. Starting from this initial random one-to-one assignment, 
subsequently a pair (corresponding to two Eulerian and two Lagrangian positions) is chosen at random. For example, 
let us consider the randomly-selected pair xi and X2 which have been assigned in the initial random assignment to qi 
and q2 respectively. Next one swaps their Lagrangian coordinates and assigns xi to q2 and X2 to qi in this example. 
If 

[(xi - qi) 2 + (x 2 - q 2 ) 2 ] > [(xj - q 2 ) 2 + (x 2 - Ql ) 2 ] , (18) 

then one swaps the Lagrangian positions, otherwise, one keeps the original assignment. This process is repeated 
until one is convinced that a lower cost cannot be achieved. However, in this manner there is no guarantee that the 
optimal assignment has been achieved and the true minimum cost has been found. Moreover, there is a possibility 
of deadlock when the cost can be decreased only by a simultaneous interchange of three or more particles, while the 
PIZA algorithm reports a spurious minimum of the cost with respect to two-particle interchanges. Results obtained 
in this way depend strongly on the choice of initial random assignment and on the random selection of the pairs and 
suffer severely from the lack of uniqueness (see Fig. 3). 

In spite of these problems associated with PIZA-type algorithms in solving the assignment problem, it has been 
shown (Narayanan and Croft 1999) in a point-by-point comparison of the reconstructed with the true initial density 
field from a N-body simulation, that PIZA performs better than other methods such as those based on a Eulerian 
formulation of the Zel'dovich approximation (discussed at the end of Section II). 




FIG. 3. The lack of uniqueness in the results of two runs using a stochastic algorithm to solve the assignment problems. 
In the stochastic algorithm, based on pair-interchange, one searches for, what is frequently referred to in pure mathematics 
as, a monotonic map instead of a true cyclic monotone map. The red and blue points (stars and boxes respectively) are the 
perfectly-reconstructed Lagrangian positions using the same stochastic code with two different random seeds. The outputs of 
the two runs do not coincide, meaning that the reconstructed Lagrangian positions (and hence peculiar velocities) depend on 
the initial random assignment and on the random pair interchange. The lack of uniqueness in this case is superficial and a 
shortcoming of the chosen numerical method. Such difficulties do not arise when deterministic algorithms are used to solve the 
assignment problem. 

There are various deterministic algorithms which guarantee that the optimal assignment is found. An example is 
an algorithm written by Henon (Henon 1995), demonstrated in Fig. 4. In this approach, a simple mechanical device 
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is defined and then simulated which solves the assignment problem. The device acts as an analog computer: the 
numbers entering the problem are represented by physical quantities and the equations are replaced by physical laws. 

One starts with N columns Bs (which represent the Lagrangian positions of the particles) and N rows, ^4s (which 
represent the Eulerian positions of the particles). On each row there are N studs whose lengths are directly related 
to the distances between that row and each column. The columns are given negative weights of —1 and act as floats 
and the rows are given weights of 1. 

The potential energy of the system shown in Fig. 4, within an additive constant, is 

i 3 

where is the height of the row A, t and (3j is the height of the column Bj, since all rows and columns have the same 
weight (1 and —1 respectively). 

Initially, all rods are maintained at a fixed position by two additional rods P and Q with the row Ai above column 
Bj, so that there is no contact between the rows and the studs. Next, the rods are released by removing the holds P 
and Q and the system starts evolving: rows go down and columns go up and the contacts are made with the studs. 
Aggregates of rows and columns are progressively formed. As new contacts are made, these aggregates are modified 
and thus a complicated evolution follows which is graphically demonstrated with a simple example in Fig. 5. One 
can then show that an equilibrium where the potential energy (19) of the system is minimum will be reached after a 
finite time. It can then be shown (Henon 1995) that if the system is in equilibrium and the column Bj is in contact 
with row Ai then the force /y is the optimal solution of the assignment problem. When /y is equal to one there is 
an assignment between rows and columns and when is equal to zero there is no assignment. The potential energy 
of the corresponding equilibrium is equivalent to the total cost of the optimal solution. 



Henon Analog Computer (1991) 




FIG. 4. Henon's mechanical device which solves the assignment problem. The device acts as an analog computer. The rows 
Ai remain parallel to the y axis, are constrained to move vertically and have positive weights. The columns Bj remain parallel 
to the x axis, can only move vertically and have negative weights. Vertical studs are placed on the columns, in such a way that 
each stud enforces a minimal distance between row Ai and column Bj. Initially all rods are kept at fixed positions by stops P 
and Q. Then the rods are released by removing the stops Q and P and the system starts evolving. Rows go down and columns 
go up and aggregates of rows and columns are made. Thus a complex evolution takes place. The rules for the formation of 
aggregates are demonstrated by a simple example in the figure that follows. 
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FIG. 5. A step-by-step (a to 1) progress of Henon's algorithm on a simple example with three initial and final positions 
(columns and rows) is shown. The table on the top left shows the values of the costs (distances between rows and columns). 
When executing the algorithm by hand, it is convenient to keep track of the distances between the rows and the studs, i.e. the 
quantities 7y = a, — f3j — Cij where on is the variable height of the row Ai and f3j is the variable height of the column Bj. A 
graph of the initial state is made with the first contact being made between the second row and second column which have the 
largest cost (note that the code originally written by Henon, in fact, finds the maximum cost. For our purpose of finding the 
minimum cost, one can just simply subtract the matrix elements dj from a large number). Thus, at this point the entries in 
the 7 matrix change since now the second row and column are in contact and hence C22 = 0. Obviously the distances between 
all other rows and columns should also be modified. The second matrix shows the new distances and automatically a contact 
is made between third row and third column whose separation is now also zero. Since the second and third rows cannot move 
now, the next contact is made between the first column and the second row and the break occurs where the total force on the 
branch is weakest. Since this is where the second row meets the support Q, part of the Q tree is captured by the P tree, as 
demonstrated. The next contact can now only be made between the first row and the first column and the break occurs at 
the weakest branch. The equilibrium position is now reached where each row is supported by one column. In the final optimal 
assignment column 1 is attached to row 1, column 2 goes to row 2 and column 3 is associated to row 3. For this simple exercise 
one can easily see that this procedure achieves the maximum cost (which in this example is 21). 



V. TEST OF MONGE— AMPERE— KANTOROVICH (MAK) RECONSTRUCTION WITH N-BODY 

SIMULATIONS 

Thus, in our reconstruction method, the initial positions of the particles are uniquely found by solving the assignment 
problem. This result was based on our reconstruction hypothesis. We could test the validity of our hypothesis by direct 
comparison with numerical N-body simulations which is what we shall demonstrate later in this section. However, 
it is worth commenting briefly on the theoretical, observational and numerical justifications for our hypothesis. It is 
well-known that the Zel'dovich approximation (Zel'dovich 1970) works well in describing the large-scale structure of 
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the Universe. In the Zel'dovich approximation particles move with their initial velocities on inertial trajectories in 
appropriately redefined coordinates. It is also known that the irrotationality of the particle displacements (expressed 
in Lagrangian coordinates) remains valid even at the second-order in the Lagrangian perturbation theory (Moutarde 
et al. 1991; Buchert 1992; Munshi et al. 1994; Catelan 1995). This provides the theoretical motivation for our first 
hypothesis. The lack of severe multistream regions is confirmed by the geometrical properties of the cosmological 
structures. In the presence of significant multistreaming such as the one that occurs in Zel'dovich approximation, one 
could not expect the formation of long-lived structures such as filaments and walls which is observed in numerical N- 
body simulations. In the presence of significant multistreaming these structures would form and would smear out and 
disappear rapidly. This is not backed by numerical simulations. The latter show the formation of shock- like structures 
well-described by Burgers model (a model of shock formation and evolution in a compressible fluid) , which has been 
extensively used to describe the large-scale structure of the Universe (Gurbatov & Saichev 1984; Gurbatov, Saichev 
& Shandarin 1989; Shandarin & Zel'dovich 1989; Vergassola et al. 1994). The success of Burgers model indicates 
that a viscosity-generating mechanism operates at small scales in collisionless systems resulting in the formation of 
shock-like structures rather than caustic-like structures. 

In spite of this evidence in support of our reconstruction hypothesis, one needs to test it before applying it to 
real galaxy catalogues. We have tested our reconstruction against numerical N-body simulation. We ran a ACDM 
simulation of 128 3 dark matter particles, using the adaptive P 3 M code HYDRA (Couchman et al. 1995). Our 
cosmological parameters are f2 m = 0.3, JIa = 0.7, h = 0.65, &g = 0.9 and a box size of 200Mpc/h. We took two 
sparse samples of 17, 178 and 100, 000 particles corresponding to grids of 6Mpc and 3Mpc respectively, at z = and 
placed them initially on a uniform grid. For the 17, 178 particle reconstruction, the sparse sampling was done by 
first selecting a subset of 32 3 particles on a regular Eulerian grid and then selecting all the particles which remained 
inside a spherical cut of radius half the size of the simulation box. The sparse sample of 100, 000 points was made 
randomly. An assignment algorithm, similar to that of Henon described in the previous section, is used to find the 
correspondence between the Eulerian and the Lagrangian positions. The results of our test are shown in Fig. 6. 




Simulation coordinate 

FIG. 6. Test of MAK reconstruction of the Lagrangian positions, using a ACDM simulation of 128 3 particles in a box of size 
200 Mpc 3 //i 3 . In the scatter plot, the dots near the diagonal are a scatter plot of reconstructed initial points versus simulation 
initial points for a grid of size 6 Mpc/h with about 20,000 points. The scatter diagram uses a quasi-periodic projection coordinate 
q = (q x + s/2q v + \fiq z )/(l + \/2 + v3) which guarantees a one-to-one correspondence between q values and points on the 
regular Lagrangian grid. The upper left inset is a histogram (by percentage) of distances in reconstruction mesh units between 
such points; the first bin corresponds to perfect reconstruction; the lower-inset is a similar histogram for reconstruction on a 
finer grid of about 3 Mpc/h using 100,000 points. With the 6 Mpc/h grid 62% of the points, and with 3 Mpc/h grid more 
than 50% of the points, are assigned perfectly. 

It is instructive to compare our results presented in Fig. 6 with those obtained by a stochastic PIZA-type algorithms. 
Fig. 7 demonstrates such a comparison. We see that not only the stochastic algorithms do not guarantee uniqueness, 
as demonstrated in Fig. 3, but also the number of exactly reconstructed points by these algorithms are much below 
those obtained by our deterministic algorithm. 
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FIG. 7. A comparison of a PIZA-type, stochastic method, versus our MAK, deterministic method of solving the assignment 
problem is shown. The outputs of the simulation used for Fig. 6 have also been used here. In the stochastic method, almost 
one billion pair interchange are made. At which point the cost-reduction cannot be reduced anymore. However, even with such 
a large number of interchanges, the number of exactly reconstructed points is well below that achieved by MAK. The upper 
sets are the scatter plots of reconstructed versus simulation Lagrangian positions for PIZA (left top set) and for MAK (right 
top set). The lower sets are histograms of the distances between the reconstructed Lagrangian positions and those given by the 
simulation. The bin sizes are taken to be slightly less than one mesh. Hence, all the points in the first (darker) bin correspond 
to those which have been exactly reconstructed. Using PIZA, we achieve a maximum of about 5000 out of the 17, 178 points 
exactly-reconstructed positions whereas with MAK this number reaches almost 8000 out of 17, 178 points. Note that, for the 
sole purpose of comparing MAK with PIZA it is not necessary to account for periodicity corrections, which would improve both 
performances equally. Accounting for the periodicity improves exactly-reconstructed MAK positions to almost 11,000 points 
out of 17, 178 points used for reconstruction, as shown in the upper inset of Fig. 6. Starting with an initial random cost of about 
200 million (Mpc/h) 2 (5000 in our box unit which runs from zero to one), after one billion pair interchange, a minimum cost of 
about 1,960,000 (Mpc/h) 2 (49 in our box unit) is achieved. Continuing to run the code on a 2 GHz DEC alpha workstation, 
consuming almost a week of CPU time, does not reduce the cost any further. With the MAK algorithm, the minimum cost 
is achieved, on the same machine, in a few minutes. The cost for MAK is 1,836,000 (Mpc/h) 2 (45.9 in box units) which is 
significantly lower than the minimum cost of PIZA. Considering that the average particle displacement in one Hubble time is 
about 10 Mpc/h (about 1/20 of the box size) this discrepancy between MAK and PIZA costs is rather significant. 



When reconstructing from observational data, in redshift space, the galaxies positions are displaced radially by an 
amount proportional to the radial component of the peculiar velocity in the line of sight. Thus for real catalogues, the 
cost minimisation need be performed using redshift space coordinates (Valentine et al. 2000). The redshift position 
of a galaxy is given by 

s = x + s ( ^ ■§ j , (20) 



H 

where under our assumption of irrotationality and convexity of the Lagrangian map and to first order in Lagrangian 
perturbation theory, 

x = 0H{x - q) (21) 

and — f(Q)/b with /(fi) = dlnS/dliia = f2^ 6 and the bias factor is <5 ga i ax i CS = bS mass . The quadratic cost function 
can then be easily found in terms of redshift space coordinates as follows 

(x-q) 2 = (s-q) 2 -^±^((s-q).s) 2 . (22) 
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Using MAK, based now on the modified cost (17) we can then find the corresponding q for each s and hence find the 
peculiar velocities. Indeed, using (20) 



x 
H 



ft?,; 6 



(23) 



Note that in this approach the motion of the local group itself is ignored, which however should also be accounted 
for (Taylor & Valentine 1999). Furthermore, the effect of catalogue selection function can be handled by giving each 
galaxy an effective-mass inversely proportional to the catalogue selection function (Nusser and Branchini 2000). We 
have also assumed that galaxies are unbiased tracers of the mass distribution. However, the relative bias between 
galaxies of different luminosities and of different morphological types indicate that galaxies are likely to be biased 
tracers of mass distribution (e.g. Loveday et al. 1995). Biasing can be taken into account in a similar manner to the 
catalogue selection function (Nusser and Branchini 2000). The net effect of biasing for our scheme is that it changes 
the effective- mass associated to the galaxies which can be easily incorporated in our cost function (17), by multiplying 
the quadratic function inside the sum by the effective-mass rrij associated with each Eulerian position Xj . 

We have also performed MAK reconstruction with the redshift-modified cost function which led to somewhat 
degraded results but at the same time provided an approximate determination of peculiar velocities (see Fig. 8). A 
more accurate determination of peculiar velocities can be made using second-order Lagrangian perturbation theory 
(e.g. Catelan 1995 and refs. therein). 




0.3 0.6 
Simulation coordinate 



FIG. 8. Reconstruction test in redshift space with the same data as that used for real-space reconstruction tested in the 
upper left histogram of Fig. 6. The velocities are smoothed over a sphere of radius 2 Mpc/h. The observer is taken to be at 
the centre of the simulation box. The Lagrangian reconstructed points are plotted against the simulation Lagrangian positions 
using the quasi-periodic projection coordinates used for the scatter plot of Fig. 6. The histogram corresponds to the distances 
between the reconstructed and the simulation Lagrangian positions for each Eulerian position. The bins of the histogram are 
smaller than one mesh and the first one corresponds to exactly-reconstructed Lagrangian positions. We obtain 43% perfect 
reconstruction. 



We have also compared the redshift-space MAK reconstruction to real space MAK reconstruction, which shows 
that almost 50% of exactly-reconstructed positions correspond to the same points (see Fig. 9). This test shows that 
MAK is robust against systematic errors. 
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FIG. 9. Comparison of redshift versus real space reconstruction allows a test of the robustness of MAK reconstruction against 
systematic errors. The same data as those used for Fig. 8 has been used to obtain the above result. In both real and redshift 
space most of the exactly reconstructed points almost 50% of the points, which fall in the first bin of the histogram, correspond 
to the same particles. 

Therefore the question that one asks at this point is: where does reconstruction perform poorly? 




FIG. 10. A thin slice (26 Mpc/h) of the box (of sides 200 Mpc/h) of the 128 3 , ACDM simulation the same as that used for 
Fig. 6 is shown. Out of the 100, 000 points which have been used for reconstruction more than half are exactly reconstructed. 
The points which fail exact reconstruction are highlighted (yellow, brighter points). The plot verifies that these points lie 
mainly in the overdense regions where we do not expect our reconstruction hypothesis to hold well. 

Fig. 10 highlights where exact reconstruction fails. We see that exact reconstruction is not achieved in particular 
in the dense regions. Achieving reconstruction at small scales remains a subject of on-going research. As long as 
multistreaming effects are unimportant, that is above ~ 1 Mpc, uniqueness of the reconstruction is guaranteed. 
Approximate algorithms capturing such highly nonlinear effects are now being developed. 
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VI. CONCLUSION 



An accurate reconstruction allows us to test cosmological models by simulating the evolution starting from the 
reconstructed primordial state and comparing it to observations. Efficient and unique reconstruction would allows us 
to determine the peculiar velocities of a large number of galaxies, using their positions in redshift catalogues. One 
could use reconstruction to search for the presence of primordial non-gaussianity and examine the self-consistency of 
cosmological hypotheses such as the choice of the global cosmological parameters and the assumed biasing scheme. By 
obtaining a point-by-point reconstruction of the specific realisation that describes the observed patch of our Universe, 
one could separate the universal properties and the influence of the large-scale environment on the galaxy formation 
process. 

Many previous reconstruction techniques have cither suffered from the lack of uniqueness or limited validity which 
could not be extended beyond the linear Eulerian regime. Our reconstruction scheme overcomes such shortcomings. 

We have shown that under suitable hypotheses, there is a unique transformation from the present positions x of 
galaxies (taken as mass tracers) to their respective initial positions q. Our reconstruction hypothesis first assumes 
that the Lagrangian map x — ► q can be written in terms of a potential x = V q $(q). This hypothesis is supported by 
numerical N-body simulations and is valid up to second order Lagrangian perturbation theory. The second assumption 
is the convexity of the potential $(q), a consequence of which is the absence of multi-streaming: for almost any Eulerian 
position, there is a single Lagrangian antecedent. As is well-known, the Zel'dovich approximation leads to caustics 
and multistreaming. This can be modified in various ways, for example by the introduction of a mock viscosity as is 
done in the adhesion model (Gurbatov and Saichev 1984, Gurbatov et al. 1989, Shandarin and Zel'dovich 1994). The 
latter which leads to shocks rather than caustics, is known to have a convex potential (Vergassola et al. 1994) and to 
be in a better agreement with numerical N-body simulations (Weinberg and Gunn 1990). 

By testing our MAK reconstruction against numerical N-body simulations both in real and redshift spaces, we have 
demonstrated that our reconstruction hypothesis works well down to the scales comparable to the size of collapsed 
structures, below which the hydrodynamical description of gravitational structure formation ceases to be meaningful 
(at scales larger than 6 Mpc/h we achieve more than 62% exact reconstruction). Thus, our reconstruction is rather 
suitable for large-scale galaxy redshift surveys. 
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thanks the Indo-French Center for the Promotion of Advanced Research (IFCPAR 2404-2). A.S. is supported by A 
CNRS Henri Poincare fellowship and RFBR 02-01-01062. R.M. is supported by the European Union Marie Curie 
Fellowship HPMF-CT-2002-01532. 
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